Some words have more importance and relevance than others. In the decision-making context, there may be no more important word than “because.” The word “because” marks a line between a conclusion and a premise. For example, “We chose a specific messaging or database application because….” In this issue's CODA, I'm going to address one of my favorite topics: the process of how we go about deciding important matters and how we support those decisions.
In almost any technology decision of any importance, there'll be at least a few meetings, more than a few long email and chat threads, some arguments, plenty of debate, and a lot of going back and forth over a mix of facts. The team must sort through all of this, as a group and as individuals, in order to plot solutions and decide, in the context of business risk, constraints, and objectives, which alternatives are better than others and how they all rank.
How do we go about that? How do we decide? And more specifically, how do we objectively make decisions when we're surrounded by so much data, some of which is presented in a subjective and skewed manner? Just like everything else in our world today, there are patterns, practices, frameworks, and rubrics for decision-making that can provide guidance. If you're thinking that subject matter areas like DevOps and Agile apply, you're correct.
But there's something more fundamental that we must rely upon for guidance. That fundamental thing is logic. And for illustrative purposes, I'm going to use the legal world as an analog.
In the law, basic guidance comes from the law itself in the form of statutes, ordinances, and regulations, and from how those are applied to the facts to determine liability or guilt. Which facts are more important and relevant than others? Which facts are more determinative in resolving the matter? Every legal proceeding is about resolving materially disputed facts in the context of applicable law. If there are no disputed facts, there's no need for a proceeding. We simply fast-forward to applying the law to the facts and then decide. In fact, that's what a motion for summary judgment is all about. In the prelude to a trial, there's also discovery, where we gather, assess, and analyze facts. We depose witnesses, submit written questions, subpoena documents, and assemble related information. If the matter must go to trial, we litigate the facts in the context of the law under the rules of procedure.
The commissioning, development, and delivery of a software project is substantively not that different from a trial. We march toward a solution, encountering facts to assess and analyze along the way. We ask for accommodations, some of which we may get in whole, in part, or not at all, based on the organization's policies and procedures. In a trial, motions for various kinds of relief are often made. The most common is when an attorney objects to testimony or to the introduction of evidence. Everything stops at that point until the judge resolves the matter. Typically, the judge turns to the other attorney and asks for a rebuttal. This is what argument is all about. It's important to understand what argument is, what it isn't, and when it's appropriate.
Technicians and engineers tend to be people with very definite opinions about certain matters. For example, one team member may have a strong preference for a certain open-source library. Others on the team may hold the opposite opinion. Still others may be completely indifferent. Hashing out these differences, so that the best alternatives for the organization can be reviewed further, is a necessary activity. Depending on the personalities involved, some of this “hashing out” will be smoother and less contentious than the rest. This is where subjectivity and skewed facts can enter the process. Undue bias in decision-making often leads to an unintentional assumption of risk because facts aren't seen for what they are. Rather, they're seen for what some or all wish they were.
Words like argument and debate are often viewed as negative because, too often, technical debates turn into religious disagreements in which each participant argues from a very personal space. Argument SHOULD be just an exchange of opposing views as to facts. Which facts? Which views? How are those views formulated? In other words, if one concludes that “Yes, our API can and will scale to the planned increase in sales volume without modification,” it raises the question “Why?” Why is it capable of scaling? Was it designed to scale? Has it been tested and verified?
It's those additional things, the ones that stand apart from the conclusions, that can be objectively reviewed. In other words, a conclusion without a sufficient premise is just a guess out of thin air. The guess may indeed turn out to be right. But what's the likelihood of success? We can guess at that too, but there's no objective basis for others to assess and analyze the facts. The point of the entire exercise is to objectively assess the facts that are likely to lead to the best outcome for the organization. It may very well turn out that what was thought to be the best alternative isn't. That would be regarded as a failure. Therefore, to improve and better support the organization, it becomes necessary to understand why the failure occurred. In other words, we must be able to say, “The process failed because….”
The critical piece to making everything work is a platform where people can collaborate on analyzing and assessing the facts. The primary constituent here is the business and its operations, not our individual egos. Cultures of success don't merely tolerate dissenting points of view; they openly encourage dissent. Any person drawing a conclusion or offering a suggestion has the burden of proof to reasonably support that position in a way that facilitates group understanding. It's unreasonable to expect others to know and experience what each participant is thinking. Success hinges on transparency.
Imagine there was no transparency in a legal proceeding. There'd be no understanding of why liability or guilt was assessed. In a typical legal proceeding, there are two important devices: the docket and the transcript. The docket is an enumeration of every document, motion, hearing, order, etc. The transcript is the record of what was said and what happened during the trial or other proceeding. In an appeal, the transcript is essential to any assertion of error because it's the only way the reviewing tribunal can know what happened and, in light of the rules, assess whether the earlier decision was in error and should be reversed.
If you had to replay the tape to determine why a certain failure is being encountered, could you?
The first thing you need is the tape itself. Is there any record? Or is it all tribal knowledge held in individual team members' recollections? Is your organization subject to SOX, SOC, HIPAA, PCI, or similar audits? Have you ever had to respond to such audit requests? Perhaps you've implemented tools related to Continuous Integration/Continuous Delivery (CI/CD), Agile, or some other aspect of DevOps. Assuming that some record exists, are its contents verifiably complete and accurate? Can you peel back the onion layers to get to the core? Can you engage in timely, efficient, and successful root-cause analysis such that you can reach informed conclusions based on a verifiable context (data, environment, or any other relevant thing)?
Verifiability is precisely what an audit is all about. Is the published application based on the last published specification? If a sufficient record, premised on a sufficient environment, exists, your deployment process should be able to respond quickly, pursuant to a standard operating procedure: “Yes, the latest published application is based on the latest spec because….”
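To make that concrete, here's a minimal sketch, in Python, of what such a standard operating procedure might automate: comparing a checksum recorded at build time against the artifact actually deployed. The manifest format and file names here are hypothetical assumptions, not a prescribed scheme; the point is that the “because” is answered from a verifiable record rather than from anyone's recollection.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file on disk."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_deployment(manifest_path: Path, artifact_path: Path) -> bool:
    """Compare the deployed artifact against the build-time record.

    The manifest is a hypothetical JSON file written by the build
    pipeline, e.g. {"spec_version": "2.4.1", "artifact_sha256": "ab12..."}.
    """
    manifest = json.loads(manifest_path.read_text())
    actual = sha256_of(artifact_path)
    if actual == manifest["artifact_sha256"]:
        print(f"Verified: the deployed artifact matches spec "
              f"{manifest['spec_version']} because its digest matches "
              f"the digest recorded at build time.")
        return True
    print("Mismatch: the deployed artifact does not match the build record.")
    return False

if __name__ == "__main__":
    verify_deployment(Path("build_manifest.json"), Path("app.zip"))
```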
It's an easy question, and it should be answerable with as little effort as possible. For anything audit-related, despite its importance and necessity, strive to make asking and answering such questions as mundane as possible. The more effort and energy spent on the mundane, the fewer resources are available for new feature development.
Here are some quick tips for improving your shop's ability to address the words that come after “because”:
- Meet as a team to reach consensus on how decisions will be made. You're not pre-determining what the decisions will be; you're establishing your rubrics for decision-making. Set out clear rules of engagement for argument and debate: such things should never be personal. Check egos at the door and hold each other accountable.
- Adopt a Definition of Done (DoD) and agree that it's inviolable. Just as there's a process for how decisions are made, among the many alternatives there's a software development process that will work well for your shop. It's far better to hold something back from deployment for quality reasons than to let it pass and ruin reputations, lose money, and create problems for yourselves and your customers. That's how technical debt unreasonably and unnecessarily accumulates. The entire point of stopping problems before they start is to better manage the accumulation of technical debt.
- Adopt a two-prong test when analyzing facts: relevance and probative value. Here's another item borrowed from the law: relevance is about how closely connected one thing is to another. How close something must be to count as relevant is a question for the entire team. Irrelevant facts are distractions. This is an important area because disagreement about relevance means one of two things: somebody is asserting a dependency relationship where none exists, or there's an unacknowledged dependency in the system. The second prong is probative value. In other words, can the fact be verified? Is it what it purports to be? Is it credible? Scrutinizing facts can be time-consuming and difficult, but facts are the foundations of your decisions. It's important to disregard, as quickly as possible, facts that are either irrelevant or lack probative value.
- It's important to have and maintain a record of what happened. When it comes time to demonstrate what happened to some third party, is there a system you can rely upon? A credible auditor can't just take someone's word for it. Remember, the auditor is assessing the facts you supply for relevance and probative value, too. There are many tools that support the processes by which we build software: Azure DevOps, GitHub, Atlassian, Jenkins, TeamCity, Rally, etc. As tools, they rely upon people and processes to work correctly and deliver the expected benefits. There's perhaps no area more rife with debate than tools: their selection, which is best, how they're best used, etc. The point here is that it's not only important to log your application's activities. It's just as important to log how your team decides things, so you can demonstrate later what happened in the past (the first sketch after this list illustrates a minimal decision log). Without a discernible record, any answer to the question of what happened is, at best, an opinion based on what individuals recall. Those recollections may be accurate or entirely off; you can't know.
- Embrace Dr. W. Edwards Deming's research and actively look to eliminate dependencies on manual inspection. The more that can be systemized and automated, the less your process is subject to the variability of human intervention. With systemized consistency, your processes become more resilient and, perhaps more importantly, more comparable over time, because you've recorded how the automated processes have been modified along the way. These automations are at the heart of unit testing and CI/CD pipelines in modern DevOps implementations (the second sketch after this list shows a manual inspection recast as an automated test).
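Here's the first sketch: a minimal, hypothetical decision log in Python. It appends one JSON object per decision to an append-only file, capturing not just the conclusion but the premise (the “because”), the alternatives considered, and the participants. The field names, file name, and example data are illustrative assumptions, not a prescribed schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical append-only record of team decisions, one JSON object per line.
DECISION_LOG = Path("decisions.jsonl")

def record_decision(decision: str, because: str,
                    alternatives: list[str], participants: list[str]) -> None:
    """Append a decision and its supporting premise to the team's record."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "decision": decision,
        "because": because,  # the premise, not just the conclusion
        "alternatives_considered": alternatives,
        "participants": participants,
    }
    with DECISION_LOG.open("a") as log:
        log.write(json.dumps(entry) + "\n")

# Example usage with made-up details:
record_decision(
    decision="Adopt PostgreSQL for the orders service",
    because="Load tests showed it handles the planned sales volume",
    alternatives=["MySQL", "MongoDB"],
    participants=["architect", "lead dev", "DBA"],
)
```

Even something this simple can later answer “we chose X because…” with a timestamped record instead of a recollection.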
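And the second sketch: a manual inspection recast as an automated check, written for pytest. The business rule and its values are hypothetical; what matters is that the check runs identically on every build, so its results are comparable over time.

```python
import pytest

def discount_price(price: float, percent: float) -> float:
    """Hypothetical business rule: apply a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Checks a human reviewer once performed by hand, now run on every build.
def test_discount_applies_correctly():
    assert discount_price(100.0, 20) == 80.0

def test_discount_rejects_invalid_percent():
    with pytest.raises(ValueError):
        discount_price(100.0, 150)
```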
If you aim to improve quality, you must understand the causal factors of success, which are learned through understanding the causal factors of failure: the “because” of it all. You need to be able to answer “We succeeded because…” and “We failed because….” This can be a long and daunting journey, but it's not only worth the time and effort, it's also necessary.